This document describes fitting a hierarchical ordination to data, including code and the details that are needed.
The data are included in the mvabund R-package, but were originally collected by Gibb et al. They include abundance observations of 41 ant species at 30 sites in Australia, along with 7 environmental variables and 5 traits. The aim is to determine how the environmental variables and traits affect composition in the ecological community, including whether they interact. The methodological question is how hierarchical ordination helps.
The abundances, traits and environment are each stored in a different matrix. First we load the data and set some constants:
library(nimble)
library(nimbleHMC)
library(mvabund)
library(coda)
data("antTraits")
Y <- antTraits$abund # abundance data of sites by species
X <- scale(antTraits$env) # environmental variables of sites by predictors
qrX <- qr(X)
X <- X[,qrX$pivot[1:qrX$rank]]
TR <- scale(model.matrix(~0+.,antTraits$traits)) # species traits
qrTR <- qr(TR)
TR <- TR[,qrTR$pivot[1:qrTR$rank]]
NSites <- nrow(Y) # number of sites
NSpecies <- ncol(Y) # number of species
NTraits <- ncol(TR) # number of traits
NEnv <- ncol(X) # number of environmental predictors
# create data lists for Nimble
dat <- list(Y = Y, X = X, TR = TR)
consts <- list(NSites = NSites, NEnv = NEnv, NTraits = NTraits, NSpecies = NSpecies)
rm(NEnv, NTraits, NSpecies, NSites)
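The qr() steps above keep only a full-rank set of columns. A minimal standalone sketch (toy matrix, not the ant data) of how the pivot identifies and drops a collinear column:

```r
# Sketch: qr() with pivoting flags linearly dependent columns, which we drop.
set.seed(1)
M <- cbind(a = rnorm(10), b = rnorm(10))
M <- cbind(M, c = M[, "a"] + M[, "b"])  # third column is exactly collinear
qrM <- qr(M)
keep <- qrM$pivot[1:qrM$rank]           # pivoted indices of independent columns
M2 <- M[, keep]
stopifnot(qrM$rank == 2, ncol(M2) == 2) # only two independent columns survive
```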
The data are counts of each species so we assume they follow a Poisson distribution with a log link function, as we would do in a standard generalised linear model. We assume that each species has a different mean abundance (i.e. for each species \(j\) we have a different intercept \(\beta_{0j}\)), and model the rest of the variation with a hierarchical ordination. This gives the following mean model on the link scale (with linear predictor \(\eta_{ij}\)):
\[ \eta_{ij} = \beta_{0j} + \boldsymbol{z}_i^\top \boldsymbol{\Sigma} \boldsymbol{\gamma}_j. \]
As in any ordination, \(\boldsymbol{z}_i\) and \(\boldsymbol{\gamma}_j\) are the site scores and species loadings; the rows of \(\boldsymbol{Z}\) hold the site scores, and its columns are the latent variables. Here we assume that each latent variable has a variance of one, so that \(\boldsymbol{\Sigma}\) holds the variation of the latent variables: it will typically be a diagonal matrix, and if any term on the diagonal is close to zero, this suggests that the corresponding latent variable has (almost) no effect. The diagonal thus provides a straightforward summary of the relative importance of each latent variable, similar to the square roots of the eigenvalues (i.e. the singular values) in an eigenanalysis.
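To see the role of \(\boldsymbol{\Sigma}\) concretely, here is a small numerical sketch (all values invented): a near-zero diagonal entry makes the corresponding latent variable contribute almost nothing to the linear predictor.

```r
# Toy numbers illustrating eta_ij = beta0_j + z_i' Sigma gamma_j
z_i     <- c(0.5, -1.2)      # site scores on two latent variables
gamma_j <- c(1.0, 0.3)       # species loadings
Sigma   <- diag(c(2, 0.01))  # second latent variable is nearly irrelevant
beta0_j <- -0.4
eta_ij  <- beta0_j + drop(t(z_i) %*% Sigma %*% gamma_j)
contrib <- z_i * diag(Sigma) * gamma_j  # per-LV contributions: 1.0 and -0.0036
stopifnot(abs(eta_ij - (beta0_j + sum(contrib))) < 1e-12)
```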
If we stopped here, this would be a standard generalised linear latent variable model (GLLVM). But here we want to model both \(\boldsymbol{z}_i\) and \(\boldsymbol{\gamma}_j\), i.e. to go from modelling groups of species responding in similar ways to sites, to modelling how the traits of the species affect their responses to the environment.
As a simplification, we can think about this as simply a regression of \(\boldsymbol{z}_i\) (the site effects) against the environmental variables and \(\boldsymbol{\gamma}_j\) against the traits. In reality, it is more complicated, because \(\boldsymbol{z}_i\) and \(\boldsymbol{\gamma}_j\) are estimated in the model, so their uncertainty needs to be propagated.
We denote the abundance of species \(j = 1 \dots p\) at site \(i = 1 \dots n\) as \(Y_{ij}\), or as the matrix \(\boldsymbol{Y}\). The environmental variables are \(x_{ik}\) for the \(k = 1 \ldots K\) predictors (the matrix \(\boldsymbol{X}\)), and the \(t = 1 \ldots T\) traits form the matrix \(\boldsymbol{TR}\). We then assume
\[Y_{ij} \sim \text{Pois}(\lambda_{ij}),\]
with
\[\text{log}(\lambda_{ij}) = \eta_{ij}.\]
Consequently, \(\eta_{ij}\) is the linear predictor, which we further model with the hierarchical ordination:
\[ \eta_{ij} = \beta_{0j} + \boldsymbol{z}_i^\top \boldsymbol{\Sigma} \boldsymbol{\gamma}_j. \]
We then model \(\boldsymbol{z}_i\) (as in van der Veen et al. (2023)) and \(\boldsymbol{\gamma}_j\) hierarchically:
\[ \boldsymbol{z}_i = \boldsymbol{B}^\top\boldsymbol{x}_i + \boldsymbol{\epsilon}_i \] and
\[ \boldsymbol{\gamma}_j = \boldsymbol{\omega}^\top\boldsymbol{TR}_{j} + \boldsymbol{\varepsilon}_j, \] where \(\boldsymbol{B}\) holds the effects of the environmental predictors on the site scores, \(\boldsymbol{\omega}\) the effects of the traits on the species loadings, and \(\boldsymbol{\epsilon}_i\) and \(\boldsymbol{\varepsilon}_j\) are site- and species-level residuals whose scales are estimated in the model.
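As a sanity check of this structure, the two hierarchical layers can be simulated directly. This is a toy sketch with invented dimensions and effect sizes (the `.toy` suffix is used to avoid clobbering the data loaded above), not the fitted model:

```r
# Simulate z_i = B' x_i + epsilon_i and gamma_j = omega' tr_j + varepsilon_j,
# feeding into a Poisson response through eta.
set.seed(42)
n <- 30; p <- 41; K <- 5; Tt <- 7; d <- 2   # sites, species, env vars, traits, LVs
X.toy  <- matrix(rnorm(n * K), n, K)
TR.toy <- matrix(rnorm(p * Tt), p, Tt)
B.toy     <- matrix(rnorm(K * d, sd = 0.2), K, d)    # environment -> site scores
omega.toy <- matrix(rnorm(Tt * d, sd = 0.2), Tt, d)  # traits -> species loadings
z.toy     <- X.toy  %*% B.toy     + matrix(rnorm(n * d, sd = 0.2), n, d)
gamma.toy <- TR.toy %*% omega.toy + matrix(rnorm(p * d, sd = 0.2), p, d)
eta.toy <- -1 + z.toy %*% diag(c(1, 0.5)) %*% t(gamma.toy)  # beta0 = -1 for all species
Y.toy   <- matrix(rpois(n * p, exp(eta.toy)), n, p)
stopifnot(dim(Y.toy) == c(n, p), all(Y.toy >= 0))
```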
We fit the model with the Nimble R-package. We start with a single dimension for simplicity, so that we can show the steps needed.
# Function to run one chain: sampling can be done with HMC or other MCMC algorithms.
# "block" specifies blocks of parameters to sample with the automated-factor (AF) slice sampler
# "slice" specifies parameters to sample with a univariate slice sampler
# Parameters in neither "block" nor "slice" are sampled with HMC
RunOneChain <- function(seed, dat, code, inits, consts, ToMonitor=NULL,
Nburn=5e3, NIter=5.5e4, Nthin=10, block = NULL, slice = NULL, ...) {
require(nimble)
require(nimbleHMC)
AllSamplers <- HMCsamplers <- c('epsilonSTAR', 'varepsilonSTAR', 'beta0', 'OSTAR', 'BSTAR',
'sd.SiteSTAR', 'sd.SpeciesSTAR','xi')
if(!is.null(block)){
HMCsamplers <- HMCsamplers[!HMCsamplers%in%unique(gsub("\\s*\\[[^\\)]+\\]","",c(unlist(block),unlist(slice))))]
}
if(is.null(ToMonitor)) {
ToMonitor <- c("beta0", "sd.Species", "sd.Site", "sd.LV", "B", "O",
"epsilon", "varepsilon", "gamma", "z", "xi")
}
mod <- nimbleModel(code = code, name = "HO", inits = inits(consts), constants = consts, data = dat, buildDerivs = TRUE)
model <- compileNimble(mod)
# Do HMC
conf <- configureHMC(model, nodes = HMCsamplers, monitors = ToMonitor, print = FALSE)
if(!is.null(block)) {
if(is.list(block)){
      # Use a blocked AF slice sampler on everything that is not being HMCed
lapply(block, conf$addSampler, type = "AF_slice", sliceAdaptFactorInterval = Nburn)
}else{
      # Use a blocked AF slice sampler on everything that is not being HMCed
sapply(block, conf$addSampler, type = "AF_slice", sliceAdaptFactorInterval = Nburn)
}
}
if(!is.null(slice)) {
if(is.list(slice)){
      # Use a univariate slice sampler on everything that is not being HMCed
lapply(slice, conf$addSampler, type = "slice")
}else{
      # Use a univariate slice sampler on everything that is not being HMCed
sapply(slice, conf$addSampler, type = "slice")
}
}
mcmc <- buildMCMC(conf)
cmcmc <- compileNimble(mcmc, project = model)
res <- runMCMC(cmcmc, niter=NIter, nburnin = Nburn, thin=Nthin,
nchains = 1, samplesAsCodaMCMC = TRUE, ...)
return(res)
}
# Function to run chains in parallel
ParaNimble <- function(NChains, ...) {
opts <- list(...)
if(!is.null(opts$seeds) && (length(opts$seeds) == NChains)){
seeds <- opts$seeds
opts <- opts[-which(names(opts)=="seeds")]
}else{
seeds <- 1:NChains
}
require(parallel)
nimble_cluster <- makeCluster(NChains)
clusterExport(nimble_cluster, "nimQR.u")
samples <- parLapply(cl = nimble_cluster, X = seeds, ...)
stopCluster(nimble_cluster)
# Name the chains in the list
chains <- setNames(samples,paste0("chain", 1:length(samples)))
chains
}
# Function to create list of names for parameters to use a block slice sampler for
MakeBlockList <- function(consts, LVwise = TRUE){
# builds list for slice AF sampling
# with LVwise = TRUE we are LV-wise blocking B and epsilon, and O and varepsilon
# otherwise we block across LVs but per parameter
if(LVwise & consts$nLVs>1){
blockList <- c(sapply(1:consts$nLVs,
function(x,consts){
c(paste0("BSTAR[", 1:consts$NEnv,", ",x, "]"),
paste0("epsilonSTAR[", 1:consts$NSites,", ",x, "]"))
}
,
consts = consts,simplify=F),
sapply(1:consts$nLVs,
function(x,consts){
c(paste0("OSTAR[", 1:consts$NTraits,", ",x, "]"),
paste0("varepsilonSTAR[", 1:consts$NSpecies,", ",x, "]"))
}
,
consts = consts,simplify=F)
)
}else{
#blockList = list("BSTAR","epsilonSTAR","OSTAR","varepsilonSTAR")
blockList = list(c("BSTAR","epsilonSTAR"),c("OSTAR","varepsilonSTAR"))
}
blockList
}
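For a small toy set of dimensions (invented here; the object is named `toy` so it does not clobber the document's `consts`), the LV-wise blocks contain node names of the following form: one block per latent variable, holding that LV's column of BSTAR together with its column of epsilonSTAR.

```r
# Sketch of the node names MakeBlockList builds with LVwise = TRUE
toy <- list(nLVs = 2, NEnv = 3, NSites = 4)
blocks <- lapply(1:toy$nLVs, function(l) {
  c(paste0("BSTAR[", 1:toy$NEnv, ", ", l, "]"),
    paste0("epsilonSTAR[", 1:toy$NSites, ", ", l, "]"))
})
blocks[[1]]  # "BSTAR[1, 1]" "BSTAR[2, 1]" "BSTAR[3, 1]" "epsilonSTAR[1, 1]" ...
stopifnot(length(blocks) == toy$nLVs, blocks[[2]][1] == "BSTAR[1, 2]")
```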
GetMeans <- function(summ, name, d) {
var <- summ$statistics[grep(paste0('^', name), rownames(summ$statistics)),"Mean"]
matrix(var,ncol=d)
}
# Utility function returning the indices of `names` that match any pattern in v
GetInds <- function(v, names) {
if(length(v)==1) {
res <- grep(v, names)
} else {
res <- c(unlist(sapply(v, grep, x=names)))
}
res
}
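A minimal standalone check of the matching logic used here (toy parameter names, invented): grep returns the positions of the names that match each pattern in v.

```r
nms <- c("B[1, 1]", "O[1, 1]", "z[1, 1]", "sd.LV[1]")
v <- c("^B", "^z")
res <- c(unlist(sapply(v, grep, x = nms)))  # same idiom as GetInds
stopifnot(all(res == c(1, 3)))              # positions of "B[1, 1]" and "z[1, 1]"
```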
# Function to swap signs of all variables varinds to have same sign as vartosign.
ReSignChain <- function(chain, varinds, vartosign) {
# MeanSign <- sign(mean(chain[ ,s])) # Might need this
res <- t(apply(chain, 1, function(x, vs, vi) {
Names <- names(x)[vi]
if(any(grepl(",", Names))) {
lvind <- gsub(".*, ", "", Names)
} else {
lvind <- seq_along(vs)
}
x[vi] <- x[vi]*sign(x[vs[lvind]])
x
}, vs=vartosign, vi=varinds))
as.mcmc(res)
}
# Function to post-process chains to swap signs.
postProcess <- function(Chains, VarsToProcess, VarsToSwapBy = NULL, VarToSign=NULL, print=FALSE, rule = 2) {
if(is.null(VarToSign)) VarToSign <- VarsToProcess
SignInd <- GetInds(VarToSign, colnames(Chains[[1]]))
# Get indicators for all variables to have their signs changed
ProcessInds <- GetInds(VarsToProcess, names = colnames(Chains[[1]]))
# Check if > 1 LV
SeveralLVs <- any(grepl(",", colnames(Chains[[1]])[SignInd]))
LV <- gsub(".*, ", "", colnames(Chains[[1]])[SignInd])
if(rule==1){
    # Calculate, across chains, the variance of the proportion of positive draws:
    # the largest variance should belong to the parameter that swaps sign the most
Signs <- as.data.frame(lapply(Chains, function(mat, ind) {
colMeans(mat[,ind]>0)
}, ind=SignInd))
VarSign <- apply(Signs, 1, var)
if(SeveralLVs) {
LV <- gsub(".*, ", "", colnames(Chains[[1]])[SignInd])
      # Choose the variables whose sign will be used to swap the other signs
if(is.null(VarsToSwapBy)){
VarsToSwapBy <- sapply(unique(LV), function(lv, vs) {
vv <- vs[grep(lv, names(vs))]
nm <- names(which(vv==max(vv)))
if(length(nm)>1) nm <- nm[1] # probably something more sophisticated is better
nm
}, vs = VarSign, simplify = TRUE)
}
} else { # only 1 LV
if(is.null(VarsToSwapBy)){
        # Choose the variables whose sign will be used to swap the other signs
VarsToSwapBy <- names(which(VarSign==max(VarSign)))[1]
}
}
}else if(rule==2){
require(mousetrap)
jointChains <- do.call(rbind, Chains)
# bimodality score
bms<-apply(jointChains[,SignInd],2,bimodality_coefficient)
# find maximum bimodality score
if(SeveralLVs){
lstSgn <- sapply(unique(LV),function(lv)which.max(bms[grep(lv,names(bms))]),simplify=F)
# formatting
names(lstSgn) <- NULL
VarsToSwapBy <- names(unlist(lstSgn))
names(VarsToSwapBy) <- paste0(sort(unique(LV)))
}else{
lstSgn <- which.max(bms)
# formatting
VarsToSwapBy <- names(lstSgn)
names(VarsToSwapBy) <- "1]"
}
}
if(print) message(paste0("Swapping by ", paste(VarsToSwapBy, collapse=", ")))
chains.sgn <- lapply(Chains, ReSignChain, vartosign=VarsToSwapBy, varinds=ProcessInds)
as.mcmc.list(chains.sgn)
}
# Function to convert a variable in an MCMC chain that should be a matrix from
# a vector to the right matrix
ChainToMatrix <- function(ch, name) {
v <- ch[grep(paste0("^", name, "\\["), names(ch))]
l.row <- as.numeric(gsub(".*\\[", "", gsub(",.*", "", names(v))))
l.col <- as.numeric(gsub(".*, ", "", gsub("\\]", "", names(v))))
mat <- matrix(0, nrow=max(l.row), ncol=max(l.col))
mat[l.row+max(l.row)*(l.col-1)]<-v
mat
}
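The name-parsing trick used by ChainToMatrix can be checked on a small hand-made sample vector (toy values): coda stores a matrix-valued variable as a named vector like "B[2, 1]", and the row/column indices are recovered from the names.

```r
ch <- c("B[1, 1]" = 0.1, "B[2, 1]" = 0.2, "B[1, 2]" = 0.3, "B[2, 2]" = 0.4)
v <- ch[grep("^B\\[", names(ch))]
l.row <- as.numeric(gsub(".*\\[", "", gsub(",.*", "", names(v))))  # row indices
l.col <- as.numeric(gsub(".*, ", "", gsub("\\]", "", names(v))))   # column indices
mat <- matrix(0, nrow = max(l.row), ncol = max(l.col))
mat[l.row + max(l.row) * (l.col - 1)] <- v   # fill by linear (column-major) index
stopifnot(mat[2, 1] == 0.2, mat[1, 2] == 0.3)
```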
# Function to convert a variable in an MCMC chain that should be a vector to a diagonal matrix
ChainToDiag <- function(ch, name) {
v <- ch[grep(paste0("^", name, "\\["), names(ch))]
mat <- diag(v)
mat
}
RescaleVars <- function(vec, ScaleBy, SDTorescale,ToRescale) {
if(!any(c("z","gamma")%in%ScaleBy))stop("Not a valid choice.")
if("sd.LV"%in%SDTorescale)stop("Not a valid choice.")
vec2 <- vec
# get scale from z or gamma
ScBy <- ChainToMatrix(vec, ScaleBy)
SD <- apply(ScBy, 2, sd)
for(nm in unique(c(ScaleBy, ToRescale))) {
Sc.x <- ChainToMatrix(vec, nm)
Sc.x.sc <- sweep(Sc.x, 2, SD, "/")
vec2[grep(paste0("^", nm, "\\["), names(vec2))] <- c(Sc.x.sc)
}
# scale sd separately
for(nm in SDTorescale) {
SD.x <- vec[grep(paste0("^", nm, "\\["), names(vec))]
vec2[grep(paste0("^", nm, "\\["), names(vec2))] <- SD.x/SD
}
# scale into sd.LV
# sd.LV <- vec[grep("sd.LV", names(vec))]
# vec2[grep("sd.LV", names(vec2))] <- SD*sd.LV
vec2
}
RescaleChains <- function(mcmc.lst, ...) {
rescale <- lapply(mcmc.lst, function(mcmc) {
rot <- apply(mcmc, 1, RescaleVars, ...)
as.mcmc(t(rot))
})
as.mcmc.list(rescale)
}
#
# # Function to process B and O to orthogonal columns
# processChainPredEffs <- function(ch){
# B <- ChainToMatrix(ch, "B")
# O <- ChainToMatrix(ch, "O")
# do.svd.B <- svd(B)
# do.svd.O <- svd(O)
# B <- B%*%do.svd.B$v
# O <- O%*%do.svd.O$v
# ch[grep(paste0("^", "B", "\\["), names(ch))] <- c(B)
# ch[grep(paste0("^", "O", "\\["), names(ch))] <- c(O)
# ch
# }
#
# processPredEffs <- function(mcmc.lst) {
# rotated <- lapply(mcmc.lst, function(mcmc) {
# rot <- apply(mcmc, 1, processChainPredEffs)
# as.mcmc(t(rot))
# })
# as.mcmc.list(rotated)
# }
inits <- function(consts){
B = matrix(rnorm(consts$nLVs*consts$NEnv),ncol=consts$nLVs)
O = matrix(rnorm(consts$nLVs*consts$NTraits),nrow=consts$NTraits)
varepsilon = mvtnorm::rmvnorm(consts$NSpecies,rep(0,consts$nLVs),diag(consts$nLVs))
epsilon = mvtnorm::rmvnorm(consts$NSites,rep(0,consts$nLVs),diag(consts$nLVs))
xi = rgamma(1,4,3)
#for(l in 2:consts$nLVs)xi<-c(xi,rbeta(1,consts$NSpecies/(l+consts$NSpecies),l))
xi <- c(xi,rbeta(consts$nLVs-1,5,5))
list(
BSTAR = B,
OSTAR = O,
epsilonSTAR = epsilon,
varepsilonSTAR = varepsilon,
sd.SiteSTAR = rexp(consts$nLVs),
sd.SpeciesSTAR = rexp(consts$nLVs),
beta0 = rnorm(consts$NSpecies),
# sd.LV = rexp(consts$nLVs),
xi = xi
)
}
PlotPost <- function(var, summ, varnames=NULL, ...) {
vars <- grep(var, rownames(summ$statistics))
if(is.null(varnames)) varnames <- rownames(summ$statistics)[grep(var, rownames(summ$statistics))]
  if(length(varnames)!=length(vars))
    stop(paste0("Number of variable names, ", length(varnames), ", not the same as number of variables, ",
                length(vars)))
plot(summ$statistics[vars,"Mean"], 1:length(vars),
xlim=range(summ$quantiles[vars,]), yaxt="n", ...)
segments(summ$quantiles[vars,"2.5%"], 1:length(vars), summ$quantiles[vars,"97.5%"], 1:length(vars))
segments(summ$quantiles[vars,"25%"], 1:length(vars), summ$quantiles[vars,"75%"], 1:length(vars), lwd=3)
abline(v=0, lty=3)
axis(2, at=1:length(vars), labels=varnames, las=1)
}
AddArrows <- function(coords, marg= par("usr"), col=2) {
origin <- c(mean(marg[1:2]), mean(marg[3:4]))
Xlength <- sum(abs(marg[1:2]))/2
Ylength <- sum(abs(marg[3:4]))/2
ends <- coords / max(abs(coords)) * min(Xlength, Ylength) * .8
  arrows(x0 = origin[1], y0 = origin[2],
         x1 = ends[, 1] + origin[1],
         y1 = ends[, 2] + origin[2],
         col = col, length = 0.1)
text(
x = origin[1] + ends[, 1] * 1.1,
y = origin[2] + ends[, 2] * 1.1,
labels = rownames(coords),
col = col)
}
Latent variable models are notorious for being unidentifiable: you can get the same mean abundances from different combinations of the parameters. We have to make some adjustments to the model to account for this; some of these are made during model fitting, but for others it is easier to work on the posterior samples afterwards.
We want to constrain one parameter to be positive, so ideally we pick a parameter whose two (reflected) modes are both away from zero. We can identify one in an ad hoc way: for each chain and each parameter we calculate the proportion of iterations in which the parameter is positive, and then for each parameter we calculate the variance of that proportion across chains. If a parameter is centred around zero the proportion will be about 0.5 in every chain and will not vary much, whereas if it is some way from zero each chain's proportion will be close to 0 or 1; because different chains can settle on different signs, the variance across chains will be high. Alternatively, we could look at the largest \(|p - 0.5|\), or visually identify such parameters from the posterior samples. There may be even better alternatives.
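This diagnostic can be illustrated on fabricated draws: a parameter whose reflected modes sit away from zero ends up with chains disagreeing about its sign, and hence a high between-chain variance of the proportion of positive draws.

```r
set.seed(1)
# Two fake chains: "flip" swaps sign between chains, "near0" straddles zero
chainA <- cbind(flip = rnorm(500, mean =  2), near0 = rnorm(500))
chainB <- cbind(flip = rnorm(500, mean = -2), near0 = rnorm(500))
props <- sapply(list(chainA, chainB), function(m) colMeans(m > 0))
v <- apply(props, 1, var)          # between-chain variance of the proportions
stopifnot(v["flip"] > v["near0"])  # the sign-flipping parameter stands out
```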
We could still estimate some but not all latent variables, for example in larger examples and to speed up computation. The maximum number of latent variables that can be informed by the predictor variables (see van der Veen et al. (2023)) is determined by the number of traits and environmental predictors that we have. In essence, we cannot have more informed latent variables than the smaller of the number of traits and the number of environmental predictors, because at that point we have reached the maximum amount of information that the combination of the two matrices can explain in the responses (although one of the matrices alone could still explain more variation). In the example here, after removing collinear columns, there are five environmental predictors but seven trait covariates. Thus, the maximum number of latent variables that can be informed by both matrices jointly is five, though two more dimensions could be informed by the traits alone.
# Model code
# Update our constants from before with the new number of LVs, rest remains the same
HO.nim <- nimbleCode({
for (i in 1:NSites) {
for (j in 1:NSpecies) {
Y[i, j] ~ dpois(lambda[i, j])
}
}
ones[1:NSites] <- rep(1, NSites)
# linear predictor
log(lambda[1:NSites, 1:NSpecies]) <- eta[1:NSites, 1:NSpecies]
  eta[1:NSites,1:NSpecies] <- asCol(ones[1:NSites]) %*% asRow(beta0[1:NSpecies]) +
    z[1:NSites,1:nLVs] %*% Sigma[1:nLVs,1:nLVs] %*% t(gamma[1:NSpecies,1:nLVs])
Sigma[1:nLVs, 1:nLVs] <- diag(sd.LV[1:nLVs])
# sites
XB[1:NSites,1:nLVs] <- X[1:NSites,1:NEnv]%*%B.ort[1:NEnv,1:nLVs]
zSTAR[1:NSites,1:nLVs] <- XB[1:NSites,1:nLVs] + epsilonSTAR[1:NSites,1:nLVs]%*%diag(sd.SiteSTAR[1:nLVs])
# species
gammaSTAR[1:NSpecies,1:nLVs] <- omegaTR[1:NSpecies,1:nLVs] + varepsilonSTAR[1:NSpecies,1:nLVs]%*%diag(sd.SpeciesSTAR[1:nLVs])
omegaTR[1:NSpecies,1:nLVs] <- TR[1:NSpecies,1:NTraits]%*%O.ort[1:NTraits,1:nLVs]
# prior for intercept
beta0[1:NSpecies] ~ dmnorm(zeroNSpecies[1:NSpecies], prec = diagNSpecies[1:NSpecies,1:NSpecies])
# prior for LV scales
xi[1] ~ dgamma(5,2)
for(l in 2:nLVs) {
    xi[l] ~ dbeta(1, 10) # could instead be dunif(0, 1) to be less informative
}
# diagonal matrices and such for use in priors
diagNTraits[1:NTraits,1:NTraits] <- diag(NTraits)
diagNEnv[1:NEnv,1:NEnv] <- diag(NEnv)
zeroNTraits[1:NTraits] <- rep(0, NTraits)
zeroNEnv[1:NEnv] <- rep(0, NEnv)
diagNSites[1:NSites,1:NSites] <- diag(NSites)
diagNSpecies[1:NSpecies,1:NSpecies] <- diag(NSpecies)
zeroNSites[1:NSites] <- rep(0, NSites)
zeroNSpecies[1:NSpecies] <- rep(0, NSpecies)
for(l in 1:nLVs) {
    # priors for LV-level errors
epsilonSTAR[1:NSites,l] ~ dmnorm(zeroNSites[1:NSites], prec = diagNSites[1:NSites,1:NSites])# Residual
varepsilonSTAR[1:NSpecies,l] ~ dmnorm(zeroNSpecies[1:NSpecies], prec = diagNSpecies[1:NSpecies,1:NSpecies]) # Residual
# priors for predictor effects
OSTAR[1:NTraits,l] ~ dmnorm(zeroNTraits[1:NTraits], prec = diagNTraits[1:NTraits,1:NTraits])
BSTAR[1:NEnv,l] ~ dmnorm(zeroNEnv[1:NEnv], prec = diagNEnv[1:NEnv,1:NEnv])
# priors for scale parameters
sd.SiteSTAR[l] ~ dexp(1)
sd.SpeciesSTAR[l] ~ dexp(1)
# constructing LV scale
sd.LV[l] <- prod(xi[1:l])
    ## standardize z and gamma for identifiability
    ## (so that B and O can have non-unit norms),
    ## and rescale the remaining parameters to match, for trace plots and such;
    ## these are the quantities that really go into eta
#norm.B[l] <- sqrt(sum(BSTAR[1:NEnv,l]^2))
#norm.O[l] <- sqrt(sum(OSTAR[1:NTraits,l]^2))
StdDev.z[l] <- sd(zSTAR[1:NSites,l])
z[1:NSites,l] <- zSTAR[1:NSites,l]/StdDev.z[l]
epsilon[1:NSites,l] <- epsilonSTAR[1:NSites,l]/StdDev.z[l]
B[1:NEnv,l] <- B.ort[1:NEnv,l]/StdDev.z[l]#*norm.B[l]
sd.Site[l] <- sd.SiteSTAR[l]/StdDev.z[l]
StdDev.gamma[l] <- sd(gammaSTAR[1:NSpecies,l])
gamma[1:NSpecies,l] <- gammaSTAR[1:NSpecies,l]/StdDev.gamma[l]
varepsilon[1:NSpecies,l] <- varepsilonSTAR[1:NSpecies,l]/StdDev.gamma[l]
O[1:NTraits,l] <- O.ort[1:NTraits,l]/StdDev.gamma[l]#*norm.O[l]
sd.Species[l] <- sd.SpeciesSTAR[l]/StdDev.gamma[l]
}
# Orthogonalize B and O for identifiability
# first predictor effects B by its F norm
# B.nn[1:NEnv,1:nLVs] <- BSTAR[1:NEnv,1:nLVs]%*%diag(1/norm.B[1:nLVs])
# O.nn[1:NTraits,1:nLVs] <- OSTAR[1:NTraits,1:nLVs]%*%diag(1/norm.O[1:nLVs])
# # get Cholesky factor
# L.B[1:nLVs,1:nLVs] <- chol(t(B.nn[1:NEnv,1:nLVs])%*%B.nn[1:NEnv,1:nLVs])
# L.O[1:nLVs,1:nLVs] <- chol(t(O.nn[1:NTraits,1:nLVs])%*%O.nn[1:NTraits,1:nLVs])
# # Orthogonalize predictor effects
# B.ort[1:NEnv,1:nLVs] <- t(forwardsolve(t(L.B[1:nLVs,1:nLVs]),t(B.nn[1:NEnv,1:nLVs])))
# O.ort[1:NTraits,1:nLVs] <- t(forwardsolve(t(L.O[1:nLVs,1:nLVs]),t(O.nn[1:NTraits,1:nLVs])))
B.ort[1:NEnv,1:nLVs] <- nimQR.u(BSTAR[1:NEnv,1:nLVs], nLVs)
O.ort[1:NTraits,1:nLVs] <- nimQR.u(OSTAR[1:NTraits,1:nLVs],nLVs)
})
# QR via the Schwarz-Rutishauser (modified Gram-Schmidt) algorithm;
# the columns are orthogonalized but not normalized.
# The algorithm is correct, but I cannot use it with HMC.
nimQR.u <- nimbleFunction(
run = function(A = double(2), p = double()) {
for (k in 2:p) {
A[, k] <- A[, k]
for (i in 1:(k-1)) {
A[, k] <- A[, k] - inprod(A[,i], A[,k]) / inprod(A[, i], A[, i]) * A[, i]
}
}
return(A)
returnType(double(2))
} ,
buildDerivs = list(run = list(ignore = c("k", "i","p")))
)
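To convince ourselves that the loop does what it should, here is a plain-R re-implementation of the same update (a sketch; the name `gs` is invented here), together with a check that the resulting columns come out mutually orthogonal, though not normalized:

```r
# Modified Gram-Schmidt without normalization, mirroring nimQR.u
gs <- function(A) {
  p <- ncol(A)
  for (k in 2:p) {
    for (i in 1:(k - 1)) {
      # subtract the projection of column k onto the already-orthogonalized column i
      A[, k] <- A[, k] - sum(A[, i] * A[, k]) / sum(A[, i]^2) * A[, i]
    }
  }
  A
}
set.seed(7)
A <- matrix(rnorm(20), 5, 4)
Q <- gs(A)
G <- crossprod(Q)  # Gram matrix; off-diagonal entries should vanish
stopifnot(max(abs(G - diag(diag(G)))) < 1e-10)
stopifnot(identical(Q[, 1], A[, 1]))  # the first column is left untouched
```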
And now we can run the model:
consts$nLVs <- nLVs <- min(ncol(dat$X), ncol(dat$TR)) # = 5 LVs
HO.LV <- ParaNimble(4, fun = RunOneChain,
dat = dat,
code = HO.nim,
inits = inits,
# Nburn=5e1, NIter=5.5e2, Nthin=1, # for a small run
Nburn = 1000, NIter = 8e3, Nthin = 1, # for a full run
consts = consts, block = MakeBlockList(consts, LVwise = F), slice = c(paste0("beta0[",1:consts$NSpecies,"]")))
## Loading required package: parallel
VarsToProcess <- c("^B", "^O", "^epsilon", "^varepsilon", "^z", "^gamma")
# No need to rescale here: the quantities we track are already rescaled.
# # Re-scale gamma
# chains2LV.sc <- RescaleChains(HO.LV, ScaleBy = "gamma",
# SDTorescale = c("sd.Species"),
# ToRescale = c("gamma", "O.ort", "varepsilon"))
#
# # Re-scale Z
# chains2LV.sc2 <- RescaleChains(chains2LV.sc, ScaleBy = "z",
# SDTorescale = c("sd.Site"),
# ToRescale = c("z", "B.ort", "epsilon"))
# # Process B and O for orthogonality
# chains2LV.ort <- processPredEffs(chains2LV.sc2)
# post-process chains for sign-swapping
chains2LV <- postProcess(HO.LV, VarsToProcess = VarsToProcess, rule = 2, print = T)
## Loading required package: mousetrap
## Welcome to mousetrap 3.2.1!
## Swapping by z[22, 1], z[9, 2], epsilon[30, 3], epsilon[29, 4], z[1, 5]
# Calculate summaries
summ.HOLV <- summary(chains2LV)
library(basicMCMCplots)
chainsPlot(chains2LV, var = c("sd.LV"), legend = F, traceplot=TRUE)
We can also plot these as a sort of “screeplot”, to see how the variation of the LVs decreases with the number of dimensions:
par(mfrow = c(1, 2))
sdLV <- grep("sd.LV", row.names(summ.HOLV$statistics))
plot(NA, xlim = c(1, 5),
     ylim = c(0, max(summ.HOLV$quantiles[sdLV, "97.5%"]) + 0.1),
     xlab = "Latent variable", ylab = "SD")
lines(summ.HOLV$statistics[sdLV, "Mean"], type = "b")
lines(summ.HOLV$quantiles[sdLV, "2.5%"], type = "b", col = "red", lty = "dashed")
lines(summ.HOLV$quantiles[sdLV, "97.5%"], type = "b", col = "red", lty = "dashed")
xis <- grep("xi", row.names(summ.HOLV$statistics))
plot(NA, xlim = c(1, 5),
     ylim = c(0, max(summ.HOLV$quantiles[xis, "97.5%"]) + 0.1),
     xlab = "Latent variable", ylab = "xi")
lines(summ.HOLV$statistics[xis, "Mean"], type = "b")
lines(summ.HOLV$quantiles[xis, "2.5%"], type = "b", col = "red", lty = "dashed")
lines(summ.HOLV$quantiles[xis, "97.5%"], type = "b", col = "red", lty = "dashed")
Environment first:
library(basicMCMCplots)
chainsPlot(chains2LV, var = c("B"), legend = F, traceplot=TRUE)
And the trait effects:
chainsPlot(chains2LV, var = c("O"), legend = F, traceplot=TRUE)